Tutorial: Spatial-CUT&Tag via DBiT-seq

This tutorial provides a brief introduction to epigenomic analysis of experiments performed via Deterministic Barcoding in Tissue for Spatial Omics Sequencing (DBiT-seq). We use the ArchR and Seurat packages to create a spatially resolved analysis object in which epigenetic information is mapped to the tissue histology. This analysis follows standard scATAC downstream analysis as outlined in the ArchR and Seurat tutorials.

Here we present the analysis of a spatial CUT&Tag experiment with triplicate mouse brain sections. The sections were profiled with an antibody against H3K27ac (activating enhancers and/or promoters). We demonstrate:

First, we set the working directory to the local tutorial directory and load required packages.

setwd("~/atlas-git/ATX_ATAC-seq/analysis/") # change to local tutorial directory

library(ArchR)
library(dplyr)
library(ggpubr)
library(grid)
library(harmony)
library(knitr)
library(magick)
library(Matrix)
library(patchwork)
library(pheatmap)
library(purrr)
library(rmarkdown)
library(Seurat)

source("utils.R")

Set up environment, set global variables

This tutorial assumes you have created fragments files from a epigenomic alignment and preprocessing pipeline (ie. Chromap, Cell Ranger ATAC), and ‘spatial’ folders via our AtlasXBrowser app. For our custom alignment and preprocessing workflow, see fastq2frags/ in this repository.

This tutorial assumes that example data (fragments, spatial folders) is saved in in this repository in the structure described in the README.md in this directory. To access example data, please contact your AtlasXomics support scientist or contact .

Here, we set ArchR global variables, and three character vectors containing sample info for analysis.

  • addArchRThreads: number of threads to use for paralelle processing; by defaults threads is set to one half of available threads. ArchR recommends setting treads to 1/2-3/4 total available cores.

  • addArchRGenome: reference genome to be used for gene and genome annotations; Archr natively supports hg19, hg38, mm9, and mm10 and allows for the creation of custom references. Here we use “mm10” (BSgenome.Mmusculus.UCSC.mm10). For more information on ArchR reference genomes, see ArchR documentation.

  • run_ids: identifiers for the runs/experiments included in the analysis; here we used standard AtlasXomics run identifiers (ie. Dxxxxx_NGxxxxx).

  • fragments_paths: local path to the fragments.tsv.gz files generated from a preprocessing and alignment pipeline; this tutorial assumes the file paths shown in the README.md.

  • spatial_dirs: local paths to the spatial directories generated by AtlasXBrowser. These directories contain images for the tissue ROI, run metadata, image scale factors, and a positions file (see below) needed to create a Seurat Object; more information on spatial Seurat Objects can be found here. This tutorial assumes the file paths shown in the README.md.

  • position_files: local paths to the tissue_positions_list.csv for each experiment; these files contain tixel coordinates and on/off tissue designations for each tixel.

addArchRThreads(threads = 16)
addArchRGenome("mm10")

run_ids <- c(
  "D01208",
  "D01209",
  "D01210"
)

fragment_paths <- c(
  "./fragments/cleaned_D01208_NG02241_fragments.tsv.gz",
  "./fragments/cleaned_D01209_NG02242_fragments.tsv.gz",
  "./fragments/cleaned_D01210_NG02243_fragments.tsv.gz"
)

spatial_dirs <- c(
  "./spatials/D1208/spatial/",
  "./spatials/D1209/spatial/",
  "./spatials/D1210/spatial/"
)

position_files <- c(
  "./spatials/D1208/spatial/tissue_positions_list.csv",
  "./spatials/D1209/spatial/tissue_positions_list.csv",
  "./spatials/D1210/spatial/tissue_positions_list.csv"
)

ArchRProject generation

Generate Arrow files from fragment files

Arrow files are the basic unit of analysis in ArchR. Arrow files save all sample data on disk as HDF5 files and are updated with additional layers as analysis progress. Arrow files are associated with a ArchRProject which can be accessed by R and stored in memory.

For each sample, an Arrow file is generated from a fragment file. Fragment file paths and run ids are supplied to createArrowFiles() as character vectors.

Parameters minTSS and minFrags can used to remove low-quality tixels from analysis.

By default, createArrowFiles() generates a TileMatrix (fragment counts per tile/bin per cell) and a GeneScoreMatrix (computed gene activity per cell) for each sample; here we increase the size of tiles from 500 to 5000 according to Deng, 2022.

ArrowFiles <- createArrowFiles(
  inputFiles = fragment_paths,
  sampleNames = run_ids,
  minTSS = 2,
  minFrags = 0,
  maxFrags = 1e+07,
  TileMatParams = list(tileSize = 5000),
  force = TRUE
)

Quality control plot PDFs are saved in ./QualityControl for each sample during Arrow file creation.

Create ArchRProject

ArchR accesses data by associating the Arrow files on disk with an ArchRProject in memory.

proj <- ArchRProject(
  ArrowFiles = ArrowFiles,
  outputDirectory = "ArchRProject"
)

Filter “off-tissue” tixels

The tissue_positions.csv file generated via AtlasXBrowser is used to remove ‘off-tissue’ tixels from analysis.

all_ontissue <- c()
for (i in seq_along(position_files)) {
  positions <- read.csv(position_files[i], header = FALSE)
  positions$V1 <- paste(run_ids[i], "#", positions$V1, "-1", sep = "")
  on_tissue <- positions$V1 [which(positions$V2 == 1)]
  all_ontissue <- c(all_ontissue, on_tissue)
}
proj <- proj[proj$cellNames %in% all_ontissue]

QC plots

Plots of log10(unique nuclear fragments) vs TSS enrichment score and fragment size distribution per sample can be found in the “QualityControl” folder in working directory. Combined plots can also be generated.

plotTSSEnrichment(proj)
## ArchR logging to : ArchRLogs/ArchR-plotTSSEnrichment-1c2b2ddba60d-Date-2023-09-01_Time-12-19-22.718944.log
## If there is an issue, please report to github with logFile!
## ArchR logging successful to : ArchRLogs/ArchR-plotTSSEnrichment-1c2b2ddba60d-Date-2023-09-01_Time-12-19-22.718944.log

plotFragmentSizes(proj)
## ArchR logging to : ArchRLogs/ArchR-plotFragmentSizes-1c2b2f72b700-Date-2023-09-01_Time-12-20-17.583585.log
## If there is an issue, please report to github with logFile!
## ArchR logging successful to : ArchRLogs/ArchR-plotFragmentSizes-1c2b2f72b700-Date-2023-09-01_Time-12-20-17.583585.log

Dimensionality reduction, clustering, UMAP plotting

Dimension reduction performed with ArchR LSI function; batch correction performed with Harmony. Seurat’s FindClusters() function is used for graph clustering.

proj <- addIterativeLSI(
  ArchRProj = proj,
  useMatrix = "TileMatrix",
  name = "IterativeLSI",
  iterations = 2,
  clusterParams = list(
    resolution = c(0.5),
    sampleCells = 10000,
    n.start = 10
  ), 
  varFeatures = 50000,
  dimsToUse = 1:30,
  force = TRUE
)

proj <- addHarmony(
  ArchRProj = proj,
  reducedDims = "IterativeLSI",
  name = "Harmony",
  groupBy = "Sample",
  force = TRUE
)

proj <- addClusters(
  input = proj,
  reducedDims = "Harmony",
  method = "Seurat",
  name = "Clusters",
  resolution = c(1.0),
  force = TRUE
)
## Modularity Optimizer version 1.3.0 by Ludo Waltman and Nees Jan van Eck
## 
## Number of nodes: 7308
## Number of edges: 386686
## 
## Running Louvain algorithm...
## Maximum modularity in 10 random starts: 0.7675
## Number of communities: 11
## Elapsed time: 0 seconds
proj <- addUMAP(
  ArchRProj = proj,
  reducedDims = "Harmony",
  name = "UMAP",
  nNeighbors = 30,
  minDist = 0.0,
  metric = "cosine",
  force = TRUE
)

Plot UMAP and cluster distribution

The UMAP can be visualized and colored by sample and clusters.

# plot the UMAP, colored by sample
p1 <- plotEmbedding(
  ArchRProj = proj,
  colorBy = "cellColData",
  name = "Sample",
  embedding = "UMAP"
)
## ArchR logging to : ArchRLogs/ArchR-plotEmbedding-1c2b14f6f256-Date-2023-09-01_Time-12-22-57.329922.log
## If there is an issue, please report to github with logFile!
## Getting UMAP Embedding
## ColorBy = cellColData
## Plotting Embedding
## 1 
## ArchR logging successful to : ArchRLogs/ArchR-plotEmbedding-1c2b14f6f256-Date-2023-09-01_Time-12-22-57.329922.log
# plot the UMAP, colored by clusters
p2 <- plotEmbedding(
  ArchRProj = proj,
  colorBy = "cellColData",
  name = "Clusters",
  embedding = "UMAP"
)
## ArchR logging to : ArchRLogs/ArchR-plotEmbedding-1c2b609ff6da-Date-2023-09-01_Time-12-22-57.550052.log
## If there is an issue, please report to github with logFile!
## Getting UMAP Embedding
## ColorBy = cellColData
## Plotting Embedding
## 1 
## ArchR logging successful to : ArchRLogs/ArchR-plotEmbedding-1c2b609ff6da-Date-2023-09-01_Time-12-22-57.550052.log
p1 + p2

Plot cluster distribution by sample

df1 <- as.data.frame(proj@cellColData)
n_clusters <- length(unique(proj$Clusters))
colors <- ArchRPalettes$stallion[as.character(seq_len(n_clusters))]
names(colors) <- paste0("C", seq_len(n_clusters))

df2 <- df1 %>% group_by(Sample, Clusters) %>%
  summarise(total_count = n(), .groups = "drop") %>%
  as.data.frame()

comp1 <- ggplot(df2, aes(fill = Clusters, y = total_count, x = Sample)) +
  geom_bar(position = "stack", stat = "identity") +
  scale_fill_manual(values = colors)

comp2 <- ggplot(df2, aes(fill = Sample, y = total_count, x = Clusters)) +
  geom_bar(position = "stack", stat = "identity")

comp3 <- ggplot(df2, aes(fill = Sample, y = total_count, x = Clusters)) +
  geom_bar(position = "fill", stat = "identity")

comp1 + comp2 + comp3

cluster_distribution
cluster_distribution

It’s a good idea to frequently save your ArchRProject, especially after running expensive computations.

saveArchRProject(
  ArchRProj = proj,
  outputDirectory = "ArchRProject",
  load = FALSE
)
## Saving ArchRProject...

SeuratObject visualization

Build SeuratObjects

Create a metadata table for SeuratObject.

metadata <- getCellColData(ArchRProj = proj)
rownames(metadata) <- str_split_fixed(
  str_split_fixed(
    row.names(metadata),
    "#",
    2)[, 2],
  "-",
  2)[, 1]
metadata["log10_nFrags"] <- log(metadata$nFrags)

Create a gene matrix for Seurat object.

proj <- addImputeWeights(proj, reducedDims = "Harmony")
## ArchR logging to : ArchRLogs/ArchR-addImputeWeights-1c2b45e5b39b-Date-2023-09-01_Time-12-23-00.48614.log
## If there is an issue, please report to github with logFile!
## 2023-09-01 12:23:00.521053 : Computing Impute Weights Using Magic (Cell 2018), 0 mins elapsed.
gene_matrix <- getMatrixFromProject(
  ArchRProj = proj,
  useMatrix = "GeneScoreMatrix"
)
## ArchR logging to : ArchRLogs/ArchR-getMatrixFromProject-1c2b6b72a8f6-Date-2023-09-01_Time-12-23-04.815204.log
## If there is an issue, please report to github with logFile!
## 2023-09-01 12:23:26.920378 : Organizing colData, 0.368 mins elapsed.
## 2023-09-01 12:23:26.9438 : Organizing rowData, 0.369 mins elapsed.
## 2023-09-01 12:23:26.945444 : Organizing rowRanges, 0.369 mins elapsed.
## 2023-09-01 12:23:26.948891 : Organizing Assays (1 of 1), 0.369 mins elapsed.
## 2023-09-01 12:23:28.290411 : Constructing SummarizedExperiment, 0.391 mins elapsed.
## 2023-09-01 12:23:28.646794 : Finished Matrix Creation, 0.397 mins elapsed.
matrix <- imputeMatrix(
  mat = assay(gene_matrix),
  imputeWeights = getImputeWeights(proj)
)
## Getting ImputeWeights
## ArchR logging to : ArchRLogs/ArchR-imputeMatrix-1c2b34b2e596-Date-2023-09-01_Time-12-23-28.651579.log
## If there is an issue, please report to github with logFile!
## Using weights on disk
## Using weights on disk
rownames(matrix) <- gene_matrix@elementMetadata$name

Build SeuratObjects with build_atlas_seurat_object() from utils.R

seurat_objs <- c()
for (i in seq_along(run_ids)) {

  obj <- build_atlas_seurat_object(
    run_id = run_ids[i],
    matrix = matrix,
    metadata = metadata,
    spatial_path = spatial_dirs[i]
  )
  seurat_objs <- c(seurat_objs, obj)
}

Spatial cluster plots

Plot the clusters identities of each tixel overlaid on top of the tissue image with spatial_plot() from utils.R; this functions call the Seurat function SpatialDimPlot.

spatial_cluster_plots <- list()
for (i in seq_along(run_ids)){
  plot <- spatial_plot(seurat_objs[[i]], run_ids[i])
  spatial_cluster_plots[[i]] <- plot
}

ggarrange(
  plotlist = spatial_cluster_plots,
  ncol = 3,
  nrow = 1,
  common.legend = TRUE,
  legend = "bottom"
)

spatialdimplots
spatialdimplots

Spatial QC plots

Plot qc metrics of each tixel overlaid on top of the tissue image with feature_plot() from utils.R; this functions call the Seurat function SpatialFeaturePlot. QC metrics to plot include:

  • TSSEnrichment
  • nFrags
  • log10_nFrags
metric <- "TSSEnrichment"

spatial_qc_plots <- list()
for (i in seq_along(run_ids)){
  plot <- feature_plot(seurat_objs[[i]], metric, run_ids[i])
  spatial_qc_plots[[i]] <- plot
}

ggarrange(plotlist = spatial_qc_plots, ncol = 3, nrow = 1, legend = "right")

tssenrichment
tssenrichment

Spatial genes plots

gene <- "Opalin"

spatial_gene_plots <- list()
for (i in seq_along(run_ids)){
  plot <- feature_plot(seurat_objs[[i]], gene, run_ids[i])
  spatial_gene_plots[[i]] <- plot
}

ggarrange(plotlist = spatial_gene_plots, ncol = 3, nrow = 1, legend = "right")

spatial_gene_plo
spatial_gene_plo

Differential gene regulation

Identify marker genes

Extract gene scores and identify marker genes with thresholds FDR <= 0.05, Log2FC >= 0.2. getMarkerFeatures() returns a SummarizedExperiment object for downstream analysis; getMarkers() converts the SummarizedExperiment to a DataFrame that can be saved for later analysis.

markersGS <- getMarkerFeatures(
  ArchRProj = proj,
  useMatrix = "GeneScoreMatrix",
  groupBy = "Clusters",
  bias = c("TSSEnrichment", "log10(nFrags)"),
  testMethod = "wilcoxon"
)
## ArchR logging to : ArchRLogs/ArchR-getMarkerFeatures-1c2b58530dde-Date-2023-09-01_Time-12-25-08.177468.log
## If there is an issue, please report to github with logFile!
## MatrixClass = Sparse.Double.Matrix
## 2023-09-01 12:25:08.305328 : Matching Known Biases, 0.001 mins elapsed.
## ###########
## 2023-09-01 12:29:07.079649 : Completed Pairwise Tests, 3.981 mins elapsed.
## ###########
## ArchR logging successful to : ArchRLogs/ArchR-getMarkerFeatures-1c2b58530dde-Date-2023-09-01_Time-12-25-08.177468.log
markerList <- getMarkers(markersGS, cutOff = "FDR <= 0.05 & Log2FC >= 0.2")
write.csv(markerList, file = "markerList.csv", row.names = FALSE)

Marker gene heatmaps

A heatmap of all marker genes by cluster can be plotted with plotMarkerHeatmap.

heatmapGS <- plotMarkerHeatmap(
  seMarker = markersGS,
  cutOff = "FDR <= 0.05 & Log2FC >= 0.2",
  transpose = TRUE,
  labelMarkers = NULL
)
## ArchR logging to : ArchRLogs/ArchR-plotMarkerHeatmap-1c2b7a868f40-Date-2023-09-01_Time-12-29-08.277759.log
## If there is an issue, please report to github with logFile!
## Printing Top Marker Genes:
## C1:
##  Defb44-ps, Gm5524, Mir6342, Npas2, 9330175M20Rik, Nab1, Ctla4, Crygf, Tm4sf20, Agfg1, Cntnap5a, Eif2d, Lemd1, Mir135b, Plekha6
## C2:
##  Sntg1, Msc, Terf1, Sbspon, 4930444P10Rik, Crisp4, Pkhd1, 4930486I03Rik, Tmem14a, Gm4956, Col9a1, Col19a1, Lmbrd1, Phf3, Gm9839
## C3:
##  Mybl1, Tmem70, Ly96, Gm4850, Gm5415, Creg2, Tex30, Stat1, 1700019D03Rik, Mstn, Hecw2, Gm13749, Creb1, Pth2r, Wnt10a
## C4:
##  Dner, Reps1, Oaz1, Mir1982, Apaf1, Gm21293, Abr, 1810032O08Rik, Ccdc85c, Fbxl21, Trpc7, Mocs2, Slc4a7, Syt15, Grid1
## C5:
##  Zfp451, 1700066M21Rik, 4930598F16Rik, Mir181a-1, Tada1, Hnrnpu, Krt23, Gm11595, Krtap16-1, Klhl10, Pde6g, Gng4, Rfesd, Ttc37, Lrrc63
## C6:
##  B3gat2, Gm20172, Gm29669, Fer1l5, Cnnm4, Actr1b, Tsga10, Eif5b, Rnf149, Catip, Gm15179, Speg, Ccl20, Iqca, D2hgdh
## C7:
##  Ptp4a1, Dst, BC055402, Klf7, Mir6899, Aamp, Cyp27a1, Dnajb2, Dis3l2, C030007H22Rik, Sned1, Mir6901, Mterf4, Fam174a, 1700063A18Rik
## C8:
##  Oprk1, Pcmtd1, Cops5, Pi15, Lincmd1, Mir133b, Tram2, 1700066B17Rik, 1700019A02Rik, Adam23, Plekhm3, Kansl1l, D530049I02Rik, 6030407O03Rik, Rufy4
## C9:
##  Rrs1, Adhfe1, Tram1, Mcm3, Arid5a, Lyg1, Mob4, Slc11a1, Serpine2, Fam124b, Spata3, Ramp1, Per2, Hdac4, Gpr35
## C10:
##  Sox17, Tfap2b, 4933415F23Rik, Mir30a, Il1r1, Slc9a4, Slc40a1, Stk17b, Aox1, Bzw1, Nop58, Snord11, Bmpr2, Fam117b, Apol7d
## C11:
##  Imp4, Neurl3, Mgat4a, 4930594C11Rik, Il1rl2, Bard1, Pecr, Pid1, Mir8096, Gm19582, Ppip5k2, Sctr, Tmem37, Marco, Rgs1
## Identified 10808 markers!
##   [1] "Defb44-ps"     "Gm5524"        "Mir6342"       "Npas2"        
##   [5] "9330175M20Rik" "Nab1"          "Ctla4"         "Crygf"        
##   [9] "Tm4sf20"       "Agfg1"         "Cntnap5a"      "Eif2d"        
##  [13] "Lemd1"         "Mir135b"       "Plekha6"       "Sntg1"        
##  [17] "Msc"           "Terf1"         "Sbspon"        "4930444P10Rik"
##  [21] "Crisp4"        "Pkhd1"         "4930486I03Rik" "Tmem14a"      
##  [25] "Gm4956"        "Col9a1"        "Col19a1"       "Lmbrd1"       
##  [29] "Phf3"          "Gm9839"        "Mybl1"         "Tmem70"       
##  [33] "Ly96"          "Gm4850"        "Gm5415"        "Creg2"        
##  [37] "Tex30"         "Stat1"         "1700019D03Rik" "Mstn"         
##  [41] "Hecw2"         "Gm13749"       "Creb1"         "Pth2r"        
##  [45] "Wnt10a"        "Dner"          "Reps1"         "Oaz1"         
##  [49] "Mir1982"       "Apaf1"         "Gm21293"       "Abr"          
##  [53] "1810032O08Rik" "Ccdc85c"       "Fbxl21"        "Trpc7"        
##  [57] "Mocs2"         "Slc4a7"        "Syt15"         "Grid1"        
##  [61] "Zfp451"        "1700066M21Rik" "4930598F16Rik" "Mir181a-1"    
##  [65] "Tada1"         "Hnrnpu"        "Krt23"         "Gm11595"      
##  [69] "Krtap16-1"     "Klhl10"        "Pde6g"         "Gng4"         
##  [73] "Rfesd"         "Ttc37"         "Lrrc63"        "B3gat2"       
##  [77] "Gm20172"       "Gm29669"       "Fer1l5"        "Cnnm4"        
##  [81] "Actr1b"        "Tsga10"        "Eif5b"         "Rnf149"       
##  [85] "Catip"         "Gm15179"       "Speg"          "Ccl20"        
##  [89] "Iqca"          "D2hgdh"        "Ptp4a1"        "Dst"          
##  [93] "BC055402"      "Klf7"          "Mir6899"       "Aamp"         
##  [97] "Cyp27a1"       "Dnajb2"        "Dis3l2"        "C030007H22Rik"
## [101] "Sned1"         "Mir6901"       "Mterf4"        "Fam174a"      
## [105] "1700063A18Rik" "Oprk1"         "Pcmtd1"        "Cops5"        
## [109] "Pi15"          "Lincmd1"       "Mir133b"       "Tram2"        
## [113] "1700066B17Rik" "1700019A02Rik" "Adam23"        "Plekhm3"      
## [117] "Kansl1l"       "D530049I02Rik" "6030407O03Rik" "Rufy4"        
## [121] "Rrs1"          "Adhfe1"        "Tram1"         "Mcm3"         
## [125] "Arid5a"        "Lyg1"          "Mob4"          "Slc11a1"      
## [129] "Serpine2"      "Fam124b"       "Spata3"        "Ramp1"        
## [133] "Per2"          "Hdac4"         "Gpr35"         "Sox17"        
## [137] "Tfap2b"        "4933415F23Rik" "Mir30a"        "Il1r1"        
## [141] "Slc9a4"        "Slc40a1"       "Stk17b"        "Aox1"         
## [145] "Bzw1"          "Nop58"         "Snord11"       "Bmpr2"        
## [149] "Fam117b"       "Apol7d"        "Imp4"          "Neurl3"       
## [153] "Mgat4a"        "4930594C11Rik" "Il1rl2"        "Bard1"        
## [157] "Pecr"          "Pid1"          "Mir8096"       "Gm19582"      
## [161] "Ppip5k2"       "Sctr"          "Tmem37"        "Marco"        
## [165] "Rgs1"
## Adding Annotations..
## Preparing Main Heatmap..
## ArchR logging successful to : ArchRLogs/ArchR-plotMarkerHeatmap-1c2b7a868f40-Date-2023-09-01_Time-12-29-08.277759.log
ComplexHeatmap::draw(
  heatmapGS,
  heatmap_legend_side = "bot",
  annotation_legend_side = "bot"
)

markerGenes_all.png
markerGenes_all.png

A subset of markers genes can be plotted as well.

marker_genes_subset  <- c(
    "Tmem119", "Cx3cr1", "Itgam", # microglia
    "Slc1a2", "Gfap", # astrocytes
    "Mbp", "Opalin", "Mog", "Mobp", "Cspg4", "Cldn11", "Olig1", # oligodendrocytes
    "Nefh", "Syt1", "Rbfox3", # neurons
    "Slc17a7", # excitatory neuron
    "Gad1", # inhibitory neuron
    "Pdgfrb", "Ng2", # pericyte
    "Prox1" # denate gyrus
  )

subsetSE <- markersGS[which(rowData(markersGS)$name %in% marker_genes_subset), ]

heatmapGS_subset <- plotMarkerHeatmap(
  seMarker = subsetSE,
  cutOff = "FDR <= 0.05 & Log2FC >= 0.2",
  transpose = TRUE
)
## ArchR logging to : ArchRLogs/ArchR-plotMarkerHeatmap-1c2b7a32a86-Date-2023-09-01_Time-12-29-10.380221.log
## If there is an issue, please report to github with logFile!
## Printing Top Marker Genes:
## C1:
##  Mbp, Syt1, Rbfox3, Prox1, Nefh, Gfap, Olig1, Mog, Pdgfrb, Opalin, Gad1, Slc1a2, Cldn11, Tmem119, Slc17a7
## C2:
##  Slc17a7, Syt1, Rbfox3, Prox1, Nefh, Gfap, Olig1, Mog, Pdgfrb, Mbp, Opalin, Gad1, Slc1a2, Cldn11, Tmem119
## C3:
##  Nefh, Slc1a2, Slc17a7, Rbfox3, Prox1, Syt1, Gfap, Olig1, Mog, Pdgfrb, Mbp, Opalin, Gad1, Cldn11, Tmem119
## C4:
##  Gfap, Prox1, Syt1, Nefh, Rbfox3, Olig1, Mog, Pdgfrb, Mbp, Opalin, Gad1, Slc1a2, Cldn11, Tmem119, Slc17a7
## C5:
##  Cspg4, Mobp, Mog, Opalin, Cldn11, Olig1, Prox1, Syt1, Nefh, Gfap, Rbfox3, Pdgfrb, Mbp, Gad1, Slc1a2
## C6:
##  Syt1, Prox1, Nefh, Gfap, Rbfox3, Olig1, Mog, Pdgfrb, Mbp, Opalin, Gad1, Slc1a2, Cldn11, Tmem119, Slc17a7
## C7:
##  Prox1, Mbp, Gad1, Cspg4, Cx3cr1, Mobp, Gfap, Mog, Opalin, Cldn11, Olig1, Syt1, Nefh, Rbfox3, Pdgfrb
## C8:
##  Prox1, Gad1, Gfap, Mog, Pdgfrb, Opalin, Cldn11, Olig1, Syt1, Nefh, Rbfox3, Mbp, Slc1a2, Tmem119, Slc17a7
## C9:
##  Slc1a2, Pdgfrb, Olig1, Prox1, Syt1, Nefh, Gfap, Rbfox3, Mog, Mbp, Opalin, Gad1, Cldn11, Tmem119, Slc17a7
## C10:
##  Pdgfrb, Prox1, Syt1, Nefh, Gfap, Rbfox3, Olig1, Mog, Mbp, Opalin, Gad1, Slc1a2, Cldn11, Tmem119, Slc17a7
## C11:
##  Tmem119, Cx3cr1, Prox1, Syt1, Nefh, Gfap, Rbfox3, Olig1, Mog, Pdgfrb, Mbp, Opalin, Gad1, Slc1a2, Cldn11
## Identified 18 markers!
##  [1] "Mbp"     "Syt1"    "Rbfox3"  "Prox1"   "Nefh"    "Gfap"    "Olig1"  
##  [8] "Mog"     "Pdgfrb"  "Opalin"  "Gad1"    "Slc1a2"  "Cldn11"  "Tmem119"
## [15] "Slc17a7" "Cspg4"   "Mobp"    "Cx3cr1"
## Adding Annotations..
## Preparing Main Heatmap..
## ArchR logging successful to : ArchRLogs/ArchR-plotMarkerHeatmap-1c2b7a32a86-Date-2023-09-01_Time-12-29-10.380221.log
heatmap(
  as.matrix(heatmapGS_subset@matrix),
  scale = "column",
  col = viridis::viridis(50)
)

markGene_subset
markGene_subset

Genome tracks of marker genes

Local chromatin accessablity can be plotted against genome browser tracks. Here, we plot all genes in marker_genes_subset and save them as PDF in the ArchRProject/Plots directory.

tracks <- plotBrowserTrack(
  ArchRProj = proj,
  groupBy = "Clusters",
  geneSymbol = marker_genes_subset,
  upstream = 50000,
  downstream = 50000
)
## ArchR logging to : ArchRLogs/ArchR-plotBrowserTrack-1c2b154291c6-Date-2023-09-01_Time-12-29-10.566946.log
## If there is an issue, please report to github with logFile!
## 2023-09-01 12:29:10.60424 : Validating Region, 0.001 mins elapsed.
## GRanges object with 19 ranges and 2 metadata columns:
##        seqnames              ranges strand |     gene_id      symbol
##           <Rle>           <IRanges>  <Rle> | <character> <character>
##    [1]     chr5 113793729-113800358      - |      231633     Tmem119
##    [2]     chr9 120048683-120068296      - |       13051      Cx3cr1
##    [3]     chr7 128062640-128128160      + |       16409       Itgam
##    [4]     chr2 102658683-102790784      + |       20511      Slc1a2
##    [5]    chr11 102887336-102897200      - |       14580        Gfap
##    ...      ...                 ...    ... .         ...         ...
##   [15]    chr11 118489760-118911597      - |       52897      Rbfox3
##   [16]     chr7   45163921-45176139      + |       72961     Slc17a7
##   [17]     chr2   70562129-70602014      + |       14415        Gad1
##   [18]    chr18   61045150-61085067      + |       18596      Pdgfrb
##   [19]     chr1 190121775-190170680      - |       19130       Prox1
##   -------
##   seqinfo: 21 sequences from mm10 genome
## 2023-09-01 12:29:10.649719 : Adding Bulk Tracks (1 of 19), 0.001 mins elapsed.
## 2023-09-01 12:29:11.225337 : Adding Gene Tracks (1 of 19), 0.011 mins elapsed.
## 2023-09-01 12:29:11.358785 : Plotting, 0.013 mins elapsed.
## 2023-09-01 12:29:12.337635 : Adding Bulk Tracks (2 of 19), 0.03 mins elapsed.
## 2023-09-01 12:29:12.862264 : Adding Gene Tracks (2 of 19), 0.038 mins elapsed.
## 2023-09-01 12:29:12.967354 : Plotting, 0.04 mins elapsed.
## 2023-09-01 12:29:13.55202 : Adding Bulk Tracks (3 of 19), 0.05 mins elapsed.
## 2023-09-01 12:29:14.035781 : Adding Gene Tracks (3 of 19), 0.058 mins elapsed.
## 2023-09-01 12:29:14.134096 : Plotting, 0.059 mins elapsed.
## 2023-09-01 12:29:14.619611 : Adding Bulk Tracks (4 of 19), 0.068 mins elapsed.
## 2023-09-01 12:29:15.19574 : Adding Gene Tracks (4 of 19), 0.077 mins elapsed.
## 2023-09-01 12:29:15.316594 : Plotting, 0.079 mins elapsed.
## 2023-09-01 12:29:15.906368 : Adding Bulk Tracks (5 of 19), 0.089 mins elapsed.
## 2023-09-01 12:29:16.412609 : Adding Gene Tracks (5 of 19), 0.097 mins elapsed.
## 2023-09-01 12:29:16.511565 : Plotting, 0.099 mins elapsed.
## 2023-09-01 12:29:17.109594 : Adding Bulk Tracks (6 of 19), 0.109 mins elapsed.
## 2023-09-01 12:29:17.495717 : Adding Gene Tracks (6 of 19), 0.115 mins elapsed.
## 2023-09-01 12:29:17.594211 : Plotting, 0.117 mins elapsed.
## 2023-09-01 12:29:18.140412 : Adding Bulk Tracks (7 of 19), 0.126 mins elapsed.
## 2023-09-01 12:29:18.476644 : Adding Gene Tracks (7 of 19), 0.132 mins elapsed.
## 2023-09-01 12:29:18.604591 : Plotting, 0.134 mins elapsed.
## 2023-09-01 12:29:19.154072 : Adding Bulk Tracks (8 of 19), 0.143 mins elapsed.
## 2023-09-01 12:29:19.553103 : Adding Gene Tracks (8 of 19), 0.15 mins elapsed.
## 2023-09-01 12:29:19.654543 : Plotting, 0.151 mins elapsed.
## 2023-09-01 12:29:20.232607 : Adding Bulk Tracks (9 of 19), 0.161 mins elapsed.
## 2023-09-01 12:29:20.7235 : Adding Gene Tracks (9 of 19), 0.169 mins elapsed.
## 2023-09-01 12:29:20.81909 : Plotting, 0.171 mins elapsed.
## 2023-09-01 12:29:21.403412 : Adding Bulk Tracks (10 of 19), 0.181 mins elapsed.
## 2023-09-01 12:29:21.868197 : Adding Gene Tracks (10 of 19), 0.188 mins elapsed.
## 2023-09-01 12:29:21.968705 : Plotting, 0.19 mins elapsed.
## 2023-09-01 12:29:22.5996 : Adding Bulk Tracks (11 of 19), 0.201 mins elapsed.
## 2023-09-01 12:29:23.06226 : Adding Gene Tracks (11 of 19), 0.208 mins elapsed.
## 2023-09-01 12:29:23.162715 : Plotting, 0.21 mins elapsed.
## 2023-09-01 12:29:23.709115 : Adding Bulk Tracks (12 of 19), 0.219 mins elapsed.
## 2023-09-01 12:29:24.098224 : Adding Gene Tracks (12 of 19), 0.226 mins elapsed.
## 2023-09-01 12:29:24.232653 : Plotting, 0.228 mins elapsed.
## 2023-09-01 12:29:24.774699 : Adding Bulk Tracks (13 of 19), 0.237 mins elapsed.
## 2023-09-01 12:29:25.259914 : Adding Gene Tracks (13 of 19), 0.245 mins elapsed.
## 2023-09-01 12:29:25.361854 : Plotting, 0.247 mins elapsed.
## 2023-09-01 12:29:25.949223 : Adding Bulk Tracks (14 of 19), 0.256 mins elapsed.
## 2023-09-01 12:29:26.562391 : Adding Gene Tracks (14 of 19), 0.267 mins elapsed.
## 2023-09-01 12:29:26.656039 : Plotting, 0.268 mins elapsed.
## 2023-09-01 12:29:27.156144 : Adding Bulk Tracks (15 of 19), 0.276 mins elapsed.
## 2023-09-01 12:29:27.644103 : Adding Gene Tracks (15 of 19), 0.285 mins elapsed.
## 2023-09-01 12:29:27.743849 : Plotting, 0.286 mins elapsed.
## 2023-09-01 12:29:28.333063 : Adding Bulk Tracks (16 of 19), 0.296 mins elapsed.
## 2023-09-01 12:29:28.91347 : Adding Gene Tracks (16 of 19), 0.306 mins elapsed.
## 2023-09-01 12:29:29.028592 : Plotting, 0.308 mins elapsed.
## 2023-09-01 12:29:29.68597 : Adding Bulk Tracks (17 of 19), 0.319 mins elapsed.
## 2023-09-01 12:29:30.250037 : Adding Gene Tracks (17 of 19), 0.328 mins elapsed.
## 2023-09-01 12:29:30.359383 : Plotting, 0.33 mins elapsed.
## 2023-09-01 12:29:30.928539 : Adding Bulk Tracks (18 of 19), 0.339 mins elapsed.
## 2023-09-01 12:29:31.32187 : Adding Gene Tracks (18 of 19), 0.346 mins elapsed.
## 2023-09-01 12:29:31.421947 : Plotting, 0.348 mins elapsed.
## 2023-09-01 12:29:32.064669 : Adding Bulk Tracks (19 of 19), 0.358 mins elapsed.
## 2023-09-01 12:29:32.62225 : Adding Gene Tracks (19 of 19), 0.368 mins elapsed.
## 2023-09-01 12:29:32.720554 : Plotting, 0.369 mins elapsed.
## ArchR logging successful to : ArchRLogs/ArchR-plotBrowserTrack-1c2b154291c6-Date-2023-09-01_Time-12-29-10.566946.log
# save tracks to pdf
plotPDF(
  tracks,
  ArchRProj = proj,
  length = 6,
  name = "Gene_Tracks",
  addDOC = FALSE
)
## Plotting Gtable!
## NULL
## Plotting Gtable!
## NULL
## Plotting Gtable!
## NULL
## Plotting Gtable!
## NULL
## Plotting Gtable!
## NULL
## Plotting Gtable!
## NULL
## Plotting Gtable!
## NULL
## Plotting Gtable!
## NULL
## Plotting Gtable!
## NULL
## Plotting Gtable!
## NULL
## Plotting Gtable!
## NULL
## Plotting Gtable!
## NULL
## Plotting Gtable!
## NULL
## Plotting Gtable!
## NULL
## Plotting Gtable!
## NULL
## Plotting Gtable!
## NULL
## Plotting Gtable!
## NULL
## Plotting Gtable!
## NULL
## Plotting Gtable!
## NULL
## Plotting Other
## [1] 6

grid can be used to plot specific genes from the list.

grid::grid.newpage()
grid::grid.draw(tracks$Olig1)

olig1_track
olig1_track

Save your project.

saveArchRProject(
  ArchRProj = proj,
  outputDirectory = "ArchRProject",
  load = FALSE
)
## Saving ArchRProject...

Peak Calling

Pseudo-bulk replicates must be created for our clusters before peak calling can be performed; they are added to the ArchRProject with the addGroupCoverages() function. Peak calling is performed with MACS2; specifically, we have found MACS2 v-2.2.6 to be compatible with ArchR. The function findMacs2() can be used to find the path to your MACS2 instillation.

proj <- addImputeWeights(proj)
## ArchR logging to : ArchRLogs/ArchR-addImputeWeights-1c2b65721a81-Date-2023-09-01_Time-12-29-36.841089.log
## If there is an issue, please report to github with logFile!
## 2023-09-01 12:29:36.87648 : Computing Impute Weights Using Magic (Cell 2018), 0 mins elapsed.
proj <- addGroupCoverages(
  ArchRProj = proj,
  groupBy = "Clusters",
  force = TRUE
)
## ArchR logging to : ArchRLogs/ArchR-addGroupCoverages-1c2b262a3f2f-Date-2023-09-01_Time-12-29-41.170933.log
## If there is an issue, please report to github with logFile!
## C1 (1 of 11) : CellGroups N = 3
## C2 (2 of 11) : CellGroups N = 3
## C3 (3 of 11) : CellGroups N = 3
## C4 (4 of 11) : CellGroups N = 3
## C5 (5 of 11) : CellGroups N = 3
## C6 (6 of 11) : CellGroups N = 3
## C7 (7 of 11) : CellGroups N = 3
## C8 (8 of 11) : CellGroups N = 3
## C9 (9 of 11) : CellGroups N = 3
## C10 (10 of 11) : CellGroups N = 3
## C11 (11 of 11) : CellGroups N = 3
## 2023-09-01 12:29:41.856679 : Creating Coverage Files!, 0.011 mins elapsed.
## 2023-09-01 12:29:41.857543 : Batch Execution w/ safelapply!, 0.011 mins elapsed.
## 2023-09-01 12:45:32.850498 : Adding Kmer Bias to Coverage Files!, 15.861 mins elapsed.
## Completed Kmer Bias Calculation
## Adding Kmer Bias (1 of 33)
## Adding Kmer Bias (2 of 33)
## Adding Kmer Bias (3 of 33)
## Adding Kmer Bias (4 of 33)
## Adding Kmer Bias (5 of 33)
## Adding Kmer Bias (6 of 33)
## Adding Kmer Bias (7 of 33)
## Adding Kmer Bias (8 of 33)
## Adding Kmer Bias (9 of 33)
## Adding Kmer Bias (10 of 33)
## Adding Kmer Bias (11 of 33)
## Adding Kmer Bias (12 of 33)
## Adding Kmer Bias (13 of 33)
## Adding Kmer Bias (14 of 33)
## Adding Kmer Bias (15 of 33)
## Adding Kmer Bias (16 of 33)
## Adding Kmer Bias (17 of 33)
## Adding Kmer Bias (18 of 33)
## Adding Kmer Bias (19 of 33)
## Adding Kmer Bias (20 of 33)
## Adding Kmer Bias (21 of 33)
## Adding Kmer Bias (22 of 33)
## Adding Kmer Bias (23 of 33)
## Adding Kmer Bias (24 of 33)
## Adding Kmer Bias (25 of 33)
## Adding Kmer Bias (26 of 33)
## Adding Kmer Bias (27 of 33)
## Adding Kmer Bias (28 of 33)
## Adding Kmer Bias (29 of 33)
## Adding Kmer Bias (30 of 33)
## Adding Kmer Bias (31 of 33)
## Adding Kmer Bias (32 of 33)
## Adding Kmer Bias (33 of 33)
## 2023-09-01 12:58:17.558957 : Finished Creation of Coverage Files!, 28.606 mins elapsed.
## ArchR logging successful to : ArchRLogs/ArchR-addGroupCoverages-1c2b262a3f2f-Date-2023-09-01_Time-12-29-41.170933.log
# Set to local macs2 installation
pathToMacs2 <- findMacs2()
## Searching For MACS2..
## Found with $path!
proj <- addReproduciblePeakSet(
  ArchRProj = proj,
  groupBy = "Clusters",
  pathToMacs2 = pathToMacs2
)
## ArchR logging to : ArchRLogs/ArchR-addReproduciblePeakSet-1c2b1f9a3a3c-Date-2023-09-01_Time-12-58-17.882179.log
## If there is an issue, please report to github with logFile!
## Calling Peaks with Macs2
## 2023-09-01 12:58:18.001521 : Peak Calling Parameters!, 0.002 mins elapsed.
##     Group nCells nCellsUsed nReplicates nMin nMax maxPeaks
## C1     C1   1780       1500           3  500  500   150000
## C2     C2    507        507           3  168  171   150000
## C3     C3    776        776           3  255  265   150000
## C4     C4    882        882           3  272  333   150000
## C5     C5    894        894           3  288  308   150000
## C6     C6    490        490           3  153  184   150000
## C7     C7    514        514           3  164  186   150000
## C8     C8    282        282           3   67  108   141000
## C9     C9    255        255           3   82   90   127500
## C10   C10    741        741           3  224  264   150000
## C11   C11    187        187           3   49   70    93500
## 2023-09-01 12:58:18.005109 : Batching Peak Calls!, 0.002 mins elapsed.
## 2023-09-01 12:58:18.009015 : Batch Execution w/ safelapply!, 0 mins elapsed.
## 2023-09-01 13:02:17.288111 : Identifying Reproducible Peaks!, 3.99 mins elapsed.
## 2023-09-01 13:02:30.707594 : Creating Union Peak Set!, 4.214 mins elapsed.
## Converged after 10 iterations!
## Plotting Ggplot!
## 2023-09-01 13:02:35.77185 : Finished Creating Union Peak Set (270765)!, 4.298 mins elapsed.

Plot peak distribution amoung clusters

peak_distribution <- proj@peakSet@metadata$PeakCallSummary

comp1_peak <- ggplot(
  peak_distribution,
  aes(fill = Var1, y = Freq, x = Group)
  ) +
    geom_bar(position = "stack", stat = "identity")

comp1_peak

peak_dist
peak_dist

Identify marker peaks

Extract marker peaks with thresholds FDR <= 0.05, Log2FC >= 0.2; save to a csv for later analysis.

proj <- addPeakMatrix(proj)
## ArchR logging to : ArchRLogs/ArchR-addPeakMatrix-1c2b3cf2f352-Date-2023-09-01_Time-13-02-35.975031.log
## If there is an issue, please report to github with logFile!
## 2023-09-01 13:02:36.073573 : Batch Execution w/ safelapply!, 0 mins elapsed.
## ArchR logging successful to : ArchRLogs/ArchR-addPeakMatrix-1c2b3cf2f352-Date-2023-09-01_Time-13-02-35.975031.log
markersPeaks <- getMarkerFeatures(
  ArchRProj = proj,
  useMatrix = "PeakMatrix", 
  groupBy = "Clusters",
  bias = c("TSSEnrichment", "log10(nFrags)"),
  testMethod = "wilcoxon"
)
## ArchR logging to : ArchRLogs/ArchR-getMarkerFeatures-1c2b7244abc0-Date-2023-09-01_Time-13-03-37.345664.log
## If there is an issue, please report to github with logFile!
## MatrixClass = Sparse.Integer.Matrix
## 2023-09-01 13:03:37.649693 : Matching Known Biases, 0.004 mins elapsed.
## ###########
## 2023-09-01 13:09:21.643465 : Completed Pairwise Tests, 5.737 mins elapsed.
## ###########
## ArchR logging successful to : ArchRLogs/ArchR-getMarkerFeatures-1c2b7244abc0-Date-2023-09-01_Time-13-03-37.345664.log
markerpeakList <- getMarkers(
  markersPeaks,
  cutOff = "FDR <= 0.05 & Log2FC >= 1"
)

write.csv(
  markerpeakList,
  file = paste0("markerpeakList.csv"),
  row.names = FALSE
)

# Collect data with annotations
peak_data = data.frame(proj@peakSet@ranges, proj@peakSet@elementMetadata)
total <- merge(peak_data, markerpeakList, by = c("start", "end"))

write.csv(
  total,
  file = paste0("complete_peak_list.csv"),
  row.names = FALSE
)

Plot marker peaks

Create a heatmap of differentially regulated peaks.

heatmap_peaks <- plotMarkerHeatmap(
  seMarker = markersPeaks,
  cutOff = "FDR <= 0.05 & Log2FC >= 1",
  transpose = TRUE
)
## ArchR logging to : ArchRLogs/ArchR-plotMarkerHeatmap-1c2b431704a5-Date-2023-09-01_Time-13-09-25.557298.log
## If there is an issue, please report to github with logFile!
## Identified 13805 markers!
##   [1] "chr1:93698494-93698994"    "chr10:82661628-82662128"  
##   [3] "chr11:32219118-32219618"   "chr11:101990200-101990700"
##   [5] "chr11:118894180-118894680" "chr13:37158757-37159257"  
##   [7] "chr15:58757131-58757631"   "chr15:84079464-84079964"  
##   [9] "chr16:90580336-90580836"   "chr17:26634484-26634984"  
##  [11] "chr17:93200726-93201226"   "chr18:24763456-24763956"  
##  [13] "chr18:24853089-24853589"   "chr2:77841336-77841836"   
##  [15] "chr2:168424207-168424707"  "chr1:10164294-10164794"   
##  [17] "chr1:13599289-13599789"    "chr1:15424295-15424795"   
##  [19] "chr1:15474518-15475018"    "chr1:15483335-15483835"   
##  [21] "chr1:15710333-15710833"    "chr1:15770301-15770801"   
##  [23] "chr1:15776664-15777164"    "chr1:16092900-16093400"   
##  [25] "chr1:16518526-16519026"    "chr1:16987547-16988047"   
##  [27] "chr1:18578745-18579245"    "chr1:20135886-20136386"   
##  [29] "chr1:20376464-20376964"    "chr1:21423619-21424119"   
##  [31] "chr1:9642018-9642518"      "chr1:9645731-9646231"     
##  [33] "chr1:10570835-10571335"    "chr1:10595968-10596468"   
##  [35] "chr1:10719714-10720214"    "chr1:12619886-12620386"   
##  [37] "chr1:12691693-12692193"    "chr1:12692799-12693299"   
##  [39] "chr1:12694519-12695019"    "chr1:12770993-12771493"   
##  [41] "chr1:16664948-16665448"    "chr1:22675128-22675628"   
##  [43] "chr1:23942940-23943440"    "chr1:27475262-27475762"   
##  [45] "chr1:32172388-32172888"    "chr1:4412356-4412856"     
##  [47] "chr1:4492890-4493390"      "chr1:4571413-4571913"     
##  [49] "chr1:4654095-4654595"      "chr1:4671709-4672209"     
##  [51] "chr1:4722536-4723036"      "chr1:5017785-5018285"     
##  [53] "chr1:5018410-5018910"      "chr1:5020580-5021080"     
##  [55] "chr1:5021303-5021803"      "chr1:5024117-5024617"     
##  [57] "chr1:6486375-6486875"      "chr1:6486908-6487408"     
##  [59] "chr18:64390089-64390589"   "chr19:10242824-10243324"  
##  [61] "chr1:62809626-62810126"    "chr1:105420582-105421082" 
##  [63] "chr1:168358194-168358694"  "chr10:7052213-7052713"    
##  [65] "chr10:76316202-76316702"   "chr14:20047683-20048183"  
##  [67] "chr14:27228687-27229187"   "chr14:48090990-48091490"  
##  [69] "chr14:120685038-120685538" "chr15:12093582-12094082"  
##  [71] "chr18:25483611-25484111"   "chr19:10773456-10773956"  
##  [73] "chr19:41648157-41648657"   "chr19:53902290-53902790"  
##  [75] "chr2:153038229-153038729"  "chr1:10102097-10102597"   
##  [77] "chr1:17009652-17010152"    "chr1:24100737-24101237"   
##  [79] "chr1:24195754-24196254"    "chr1:25532294-25532794"   
##  [81] "chr1:30936500-30937000"    "chr1:33042385-33042885"   
##  [83] "chr1:33480513-33481013"    "chr1:34016599-34017099"   
##  [85] "chr1:34021485-34021985"    "chr1:34120051-34120551"   
##  [87] "chr1:34123749-34124249"    "chr1:11044844-11045344"   
##  [89] "chr1:13372624-13373124"    "chr1:21078648-21079148"   
##  [91] "chr1:24938575-24939075"    "chr1:34665138-34665638"   
##  [93] "chr1:40790471-40790971"    "chr1:41275108-41275608"   
##  [95] "chr1:41676012-41676512"    "chr1:41742962-41743462"   
##  [97] "chr1:41891683-41892183"    "chr14:65889688-65890188"  
##  [99] "chr14:65922941-65923441"   "chr3:115844726-115845226" 
## [101] "chr7:110779733-110780233"  "chr1:10993263-10993763"   
## [103] "chr1:16687913-16688413"    "chr1:23109207-23109707"   
## [105] "chr1:34858618-34859118"    "chr1:36568902-36569402"   
## [107] "chr1:39713919-39714419"    "chr1:39761965-39762465"   
## [109] "chr1:39901166-39901666"    "chr1:45855912-45856412"   
## [111] "chr1:45868452-45868952"
## Adding Annotations..
## Preparing Main Heatmap..
## ArchR logging successful to : ArchRLogs/ArchR-plotMarkerHeatmap-1c2b431704a5-Date-2023-09-01_Time-13-09-25.557298.log
draw(heatmap_peaks, heatmap_legend_side = "top", show_heatmap_legend = FALSE)

plotPDF(
  heatmap_peaks,
  name = "peaks_heatmap",
  width = 10,
  length = 6,
  ArchRProj = proj,
  addDOC = FALSE
)
## Plotting ComplexHeatmap!
## Plotting Other
## [1] 6
peak_heatmap
peak_heatmap

Motif Enrichment

Motif annotations can be added to an ArchRProject with addMotifAnnotations().

proj <- addMotifAnnotations(
  ArchRProj = proj,
  motifSet = "cisbp",
  name = "Motif",
  force = TRUE
)
## ArchR logging to : ArchRLogs/ArchR-addMotifAnnotations-1c2b61559840-Date-2023-09-01_Time-13-09-33.690868.log
## If there is an issue, please report to github with logFile!
## 2023-09-01 13:09:34.578279 : Gettting Motif Set, Species : Mus musculus, 0.001 mins elapsed.
## Using version 2 motifs!
## 2023-09-01 13:09:35.251894 : Finding Motif Positions with motifmatchr!, 0.012 mins elapsed.
## 2023-09-01 13:11:35.074993 : All Motifs Overlap at least 1 peak!, 2.009 mins elapsed.
## 2023-09-01 13:11:35.076537 : Creating Motif Overlap Matrix, 2.009 mins elapsed.
## 2023-09-01 13:11:37.185951 : Finished Getting Motif Info!, 2.044 mins elapsed.
## ArchR logging successful to : ArchRLogs/ArchR-addMotifAnnotations-1c2b61559840-Date-2023-09-01_Time-13-09-33.690868.log

Perform motif enrichment in marker peaks

Compute per-cell deviations across all of our motif annotations using the addDeviationsMatrix() function

proj <- addDeviationsMatrix(
  ArchRProj = proj,
  peakAnnotation = "Motif",
  force = TRUE
)
## Identifying Background Peaks!
## ArchR logging to : ArchRLogs/ArchR-addDeviationsMatrix-1c2b58606f34-Date-2023-09-01_Time-13-11-51.102618.log
## If there is an issue, please report to github with logFile!
## NULL
## 'as(<lgCMatrix>, "dgCMatrix")' is deprecated.
## Use 'as(., "dMatrix")' instead.
## See help("Deprecated") and help("Matrix-deprecated").
## 2023-09-01 13:11:54.109417 : Batch Execution w/ safelapply!, 0 mins elapsed.
## ###########
## 2023-09-01 13:28:44.51383 : Completed Computing Deviations!, 16.89 mins elapsed.
## ###########
## ArchR logging successful to : ArchRLogs/ArchR-addDeviationsMatrix-1c2b58606f34-Date-2023-09-01_Time-13-11-51.102618.log
plotVarDev <- getVarDeviations(
  proj,
  name = "MotifMatrix",
  plot = TRUE
)
## DataFrame with 6 rows and 6 columns
##      seqnames     idx        name combinedVars combinedMeans      rank
##         <Rle> <array>     <array>    <numeric>     <numeric> <integer>
## f305        z     305   Foxj1_305      3.40519     0.0812067         1
## f335        z     335  Gm5294_335      3.40519     0.0812067         2
## f357        z     357   Foxl1_357      3.40519     0.0812067         3
## f104        z     104     Fos_104      3.40436    -0.0416743         4
## f751        z     751    Sox4_751      3.34725     0.0689124         5
## f843        z     843 Smarcc1_843      3.28837    -0.0240090         6
SampleMotifs <- getMarkerFeatures(
  ArchRProj = proj,
  useMatrix = "MotifMatrix",
  groupBy = "Clusters",
  testMethod = "wilcoxon",
  bias = c("TSSEnrichment", "log10(nFrags)"),
  useSeqnames = "z"
)
## ArchR logging to : ArchRLogs/ArchR-getMarkerFeatures-1c2b2324ec40-Date-2023-09-01_Time-13-28-48.205733.log
## If there is an issue, please report to github with logFile!
## MatrixClass = Sparse.Assays.Matrix
## 2023-09-01 13:28:48.356164 : Matching Known Biases, 0 mins elapsed.
## ###########
## 2023-09-01 13:33:34.54378 : Completed Pairwise Tests, 4.77 mins elapsed.
## ###########
## ArchR logging successful to : ArchRLogs/ArchR-getMarkerFeatures-1c2b2324ec40-Date-2023-09-01_Time-13-28-48.205733.log
enrichMotif <- peakAnnoEnrichment(
  seMarker = markersPeaks,
  ArchRProj = proj,
  peakAnnotation = "Motif",
  cutOff = "FDR <= 0.05 & Log2FC >= 0.1"
)
## ArchR logging to : ArchRLogs/ArchR-peakAnnoEnrichment-1c2b4d126fc6-Date-2023-09-01_Time-13-33-34.696984.log
## If there is an issue, please report to github with logFile!
## 2023-09-01 13:33:39.560232 : Computing Enrichments 1 of 11, 0.081 mins elapsed.
## 2023-09-01 13:33:39.678016 : Computing Enrichments 2 of 11, 0.083 mins elapsed.
## 2023-09-01 13:33:39.791526 : Computing Enrichments 3 of 11, 0.085 mins elapsed.
## 2023-09-01 13:33:39.901217 : Computing Enrichments 4 of 11, 0.087 mins elapsed.
## 2023-09-01 13:33:40.00513 : Computing Enrichments 5 of 11, 0.088 mins elapsed.
## 2023-09-01 13:33:40.113028 : Computing Enrichments 6 of 11, 0.09 mins elapsed.
## 2023-09-01 13:33:40.240318 : Computing Enrichments 7 of 11, 0.092 mins elapsed.
## 2023-09-01 13:33:40.361127 : Computing Enrichments 8 of 11, 0.094 mins elapsed.
## 2023-09-01 13:33:40.472303 : Computing Enrichments 9 of 11, 0.096 mins elapsed.
## 2023-09-01 13:33:40.58254 : Computing Enrichments 10 of 11, 0.098 mins elapsed.
## 2023-09-01 13:33:40.705643 : Computing Enrichments 11 of 11, 0.1 mins elapsed.
## ArchR logging successful to : ArchRLogs/ArchR-peakAnnoEnrichment-1c2b4d126fc6-Date-2023-09-01_Time-13-33-34.696984.log

Plot a heatmap of motifs enriched in marker peaks

heatmapEM <- plotEnrichHeatmap(enrichMotif, n = 7, transpose = TRUE)
## ArchR logging to : ArchRLogs/ArchR-plotEnrichHeatmap-1c2b155561af-Date-2023-09-01_Time-13-33-41.038966.log
## If there is an issue, please report to github with logFile!
## Adding Annotations..
## Preparing Main Heatmap..
plotPDF(
  heatmapEM,
  name = "motifs_heatmap",
  width = 10,
  length = 6,
  ArchRProj = proj,
  addDOC = FALSE
)
## Plotting ComplexHeatmap!
## Plotting Other
## [1] 6
heatmapEM

motif_heatmap
motif_heatmap

Save your ArchRProject.

saveArchRProject(
  ArchRProj = proj,
  load = FALSE
)
## Saving ArchRProject...

Approximate cell typing

https://satijalab.org/seurat/reference/addmodulescore

Marker genes:

  • microglia: “Tmem119”, “Cx3cr1”, “Itgam”
  • astrocytes: “Slc1a2”, “Gfap”
  • oligodendrocytes: “Mbp”, “Opalin”, “Mog”, “Mobp”, “Cspg4”, “Cldn11”, “Olig1”
  • neurons: “Nefh”, “Syt1”, “Rbfox3”
  • excitatory neurons: “Slc17a7”
  • inhibitory neuron: “Gad1”
  • pericyte: “Pdgfrb”
  • denate gyrus: “Prox1”
marker_genes <- c("Mbp", "Opalin", "Mog", "Mobp", "Cspg4", "Cldn11", "Olig1")
cell_type <- "oligodendrocytes"

geneset_plots <- list()
for (i in seq_along(run_ids)){
  plot <- geneset_plot(seurat_objs[[i]], marker_genes, cell_type, run_ids[i])
  geneset_plots[[i]] <- plot
}

ggarrange(plotlist = geneset_plots, ncol = 3, nrow = 1, legend = "right")

cell_type
cell_type